Applied Sciences
Open Access Article

Numerical Simulation of Time-Optimal Path Planning for Autonomous Underwater Vehicles Using a Markov Decision Process Method

by Mingrui Shu, Xiuyu Zheng, Fengguo Li, Kaiyong Wang and Qiang Li

Abstract: Many path planning algorithms developed for land-based or airborne autonomous vehicles no longer apply underwater. A time-optimal path planning method for autonomous underwater vehicles (AUVs), based on a Markov decision process (MDP) algorithm, is proposed for the marine environment. Its performance is examined under different oceanic conditions, including complex coastal bathymetry and time-varying ocean currents, revealing advantages over the A* algorithm, a traditional path planning method. The ocean current is predicted using a regional ocean model and then provided to the MDP algorithm as a priori information. A computation-efficient yet feature-resolving spatial resolution is determined through a series of sensitivity experiments. The simulations demonstrate the importance of incorporating ocean currents in the path planning of AUVs in the real ocean. The MDP algorithm remains robust even when the ocean current is complex.

Keywords: path planning; autonomous underwater vehicles; Markov decision process; ocean current

1. Introduction

Autonomous underwater vehicles (AUVs) are deployed in the ocean to carry out specific missions, such as ocean observation and target identification, owing to their excellent maneuverability. To complete a mission, an AUV is often required to move to a target location under constraints such as the shortest time, the shortest distance, or the least energy consumption, while avoiding risks or obstacles in the ocean. Relevant path planning algorithms have been developed for these different constraints.
In this paper, the time-optimal constraint is prioritized.

The path planning algorithms for mobile robots can be divided into two categories: discrete-grid-based and sampling-based planning algorithms [1]. The former operates on a gridded map. For example, the A* algorithm extends the Dijkstra algorithm by adding a heuristic cost to enhance computational efficiency. The A* algorithm has been further modified to speed up its convergence, yielding variants such as iterative deepening A*, lifelong planning A*, and bidirectional A* [2,3]. Liu et al. [4] introduced an improved A* algorithm for generating normal and berthing paths while accounting for obstacles, currents, and marine traffic. Sampling-based planning algorithms do not directly compute a global optimum on a gridded map; instead, they randomly scatter samples over the map to guide planning. The probabilistic roadmap [5] and rapidly-exploring random tree [6] algorithms are two typical examples. Recently, metaheuristic algorithms, for instance, the ant colony algorithm and the genetic algorithm, have also been applied to path planning [7]. The level-set method is another important branch of robotic path planning [8].

With the rapid development of artificial intelligence, reinforcement learning has been applied to robotic path planning. Julien et al. [9] presented an MDP-based planning method for a wheeled robot. To account for the impact of the surrounding environment, Lou et al. [10] modeled robotic motion as a Markov process and proposed a probabilistic-model-checking method to seek the optimal path. Singh [11] applied the object-oriented Markov decision process to indoor robots, which greatly alleviates the so-called "curse of dimensionality": it reduces the state space by treating MDP properties as objects. Pereira et al.
[12] used a minimum expected risk planner and a risk-aware Markov decision process to improve the reliability and safety of AUV operation in coastal regions.

Unlike land-based or airborne robots, AUVs have special characteristics arising from the underwater environment:

(1) Low positioning accuracy and high positioning cost. Land robots can directly acquire real-time positions through the Global Positioning System (GPS) with relatively high positioning accuracy (about 0.5 m [13]). Because seawater absorbs electromagnetic waves, satellite signals cannot be received directly underwater by AUVs. Instead, an inertial navigation system (INS) is commonly used for underwater positioning. However, it is expensive while providing low positioning accuracy, producing errors of about 100 m after traveling 1 km [14]. Its positioning error accumulates over time and must be calibrated frequently during long-term underwater deployment.

(2) Low cruising speed and high fault tolerance. Compared with land-based or airborne robots, the maximum speed of underwater robots is much slower, ranging from about 3 to 10 knots [15]. Except near harbors, marine traffic is limited, and the marine morphology, such as coastlines, islands, and seamounts, is relatively fixed. Therefore, more tolerance is allowed for AUVs to tip or roll over in the ocean.

(3) High impact of the marine environment. In the ocean, the current velocity and the speed of an AUV are usually of the same order of magnitude. If the ocean current is neglected, an obvious deviation arises between the planned and actual paths. Conversely, if real-time robust control is applied to eliminate these deviations, it costs extra power and computational resources.

Several works have addressed the AUV path planning problem. Garau et al. [16] used the A* search procedure to determine the optimal path while accounting for ocean currents and their spatial variability. Zeng et al.
[17] introduced a quantum-behaved particle swarm optimization algorithm for solving the optimal path planning problem of an AUV operating in static ocean currents. Witt et al. [18] described a novel optimal path planning strategy for long-duration operations in environments with time-varying ocean currents. Kularatne et al. [19] presented a graph-search-based method to compute energy-optimal paths for AUVs in two-dimensional (2-D) time-varying flows. Subramani et al. [20] integrated data-driven ocean modeling with the stochastic dynamically orthogonal level-set optimization methodology to compute and study energy-optimal paths. Lolla et al. [21] predicted the time-optimal paths of autonomous vehicles navigating in continuous, strong, and dynamic ocean currents by solving an exact partial differential equation. Rhoads et al. [22] presented a numerical method for minimum-time heading control of an underwater vehicle moving at fixed speed in known time-varying, two-dimensional flow fields.

The MDP method is well suited to AUV underwater path planning. The MDP seeks a globally optimal solution through a value iteration method: the optimal paths from all state points in the whole domain to the target are computed only once. This is more efficient than the traditional A* algorithm [23,24], which must repeat similar computations for every query. Because AUVs move underwater with a low cruising speed and a high fault tolerance, the actions of AUVs are limited and thus suitable for establishing an MDP model; otherwise, the computational difficulty would increase exponentially with the number of robotic actions. In our application, the ocean current is predicted by an oceanic forecast model, so the ocean currents can be regarded as fully observable. This information is provided to the AUV a priori or in real time through acoustic communications, so that the parameters used in the MDP model can be updated.
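As a point of reference for the comparison above, the A* baseline can be sketched as follows. This is a minimal illustration, assuming a 4-connected occupancy grid, unit move costs, and a Manhattan-distance heuristic; the grid, start, and goal are invented for the example and are not the configuration used in the paper.

```python
import heapq

def a_star(grid, start, goal):
    """A* search on a 4-connected occupancy grid.

    grid[r][c] == 1 marks an obstacle (e.g., land or an island).
    The Manhattan distance is an admissible heuristic for
    unit-cost axis-aligned moves, so the returned path is optimal.
    """
    rows, cols = len(grid), len(grid[0])

    def h(p):  # heuristic: Manhattan distance to the goal
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_set = [(h(start), 0, start, [start])]  # (f, g, node, path)
    closed = set()
    while open_set:
        f, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                nxt = (nr, nc)
                if nxt not in closed:
                    heapq.heappush(open_set,
                                   (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None  # goal unreachable

# Toy map: the middle row is blocked except its rightmost cell.
grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = a_star(grid, (0, 0), (2, 0))
```

Note that the search must be repeated from scratch for every new start point, which is exactly the per-query cost that the value iteration of the MDP method avoids.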
The fully observable MDP model has a faster convergence rate than partially observable MDP models.

The paper is organized as follows. The principle of MDP path planning and its numerical algorithm for AUV navigation are introduced in Section 2. In Section 3, the efficiency of the MDP algorithm is examined and its performance is compared with the traditional A* algorithm; then, ocean currents predicted by a regional ocean model are incorporated into the MDP model to evaluate its performance in a 'real' oceanic environment. The conclusions are presented in Section 4.

2. Path Planning Algorithm Based on the Markov Decision Process (MDP)

2.1. Markov Decision Process

The target region is first divided into multiple orthogonal grid cells. For path planning, an action that the AUV takes depends only upon the present state, not on the previous states it has experienced. Here, a state specifically refers to the presence of the AUV inside a grid cell, rather than the movement process of the AUV itself. Since each grid cell usually spans several hundred meters to several kilometers, much larger than the actual size of the AUV (usually 1 to 10 m), the AUV has enough time and space to adjust its state inside the cell. Therefore, the AUV's movement can be treated as a Markov process. A five-tuple $(S, A, \{P_{s,s'}^{a}\}, \gamma, R)$ is used to describe this process. Here, $S$ is the state set, providing the position and velocity of the AUV. $A$ denotes the action set, from which the AUV chooses moves from its present grid cell to its neighbors. $\{P_{s,s'}^{a}\}$ denotes the transition probability matrix, defined as

$$P_{s,s'}^{a} = P\left[S_{t+1} = s' \mid A_t = a, S_t = s\right],$$

in which $P_{s,s'}^{a}$ is the probability that the AUV changes from its current state $s$ to its successor state $s'$ by taking an action $a$.
Thus,

$$\sum_{s'} P_{s,s'}^{a} = 1, \qquad P_{s,s'}^{a} \in [0, 1].$$

$\gamma \in [0, 1]$ is a discount factor, used to adjust the weighting between present and future rewards; in our simulations, we choose $\gamma = 0.95$. $R$ is a reward function of $S$ and $A$, i.e., $R: S \times A \to \mathbb{R}$. $R_s^a$ denotes the reward the AUV receives for taking an action $a$ in the present state $s$. Its value is the expectation given $A$ and $S$ at step $t$:

$$R_s^a = E\left[R_{t+1} \mid A_t = a, S_t = s\right],$$

in which $R_{t+1}$ is the reward that the AUV receives at the next step $t + 1$. Thus, the Markov decision process is described as

$$s_0 \xrightarrow{a_0} s_1 \xrightarrow{a_1} s_2 \xrightarrow{a_2} s_3 \xrightarrow{a_3} s_4 \xrightarrow{a_4} \cdots$$

The goal is to find an optimal strategy among all possible actions that maximizes the expectation of the system's total reward:

$$\max \; E\left[R(s_0) + \gamma R(s_1) + \gamma^2 R(s_2) + \cdots\right].$$

The Bellman equation [25] is used to solve the Markov decision process iteratively. When the AUV takes actions from the present state $s$ following a strategy $\pi$, the expected total reward becomes

$$V(s) = E\left[R(s_0) + \gamma R(s_1) + \gamma^2 R(s_2) + \cdots \mid s_0 = s, \pi\right] = E\left[R(s_0) + \gamma\left(R(s_1) + \gamma R(s_2) + \cdots\right) \mid s_0 = s, \pi\right] = E\left[R(s_0) + \gamma V^{\pi}(s_1) \mid s_0 = s, \pi\right],$$

in which $R(s)$ denotes the reward in the current state $s$.

2.2. Value Iteration Method

The general solution of an MDP can be obtained through value iteration or strategy iteration [26]. The value iteration method gives the optimal strategy from an arbitrary point to the target, which is suitable for the real-time control of AUVs. Its abbreviated procedure is described below.

In Algorithm 1, we use the greedy strategy to compute $V_n(s')$, i.e., ensuring the maximum reward in every iteration, and then the Bellman Equation (5) is used to solve $V(s)$. $\pi(s)$ is solved using the value iteration method; it can be shown that the solution converges to the optimal strategy $\pi^*(s)$ [27]. Algorithm 1.
The Value Iteration in the MDP Algorithm
INPUT: MDP five-tuple $(S, A, \{P_{s,s'}^{a}\}, \gamma, R)$, maximum iteration number $N$, deviation $\varepsilon$
ITERATION:
  $\forall s \in S,\ V_0(s) = 0$
  for $n$ in range $(1, N)$:
    for each state $s$:
      $V_{n+1}(s) = \max_{a \in A} \sum_{s'} P(s' \mid s, a)\left(R(s, a) + \gamma V_n(s')\right)$
    if $\forall s,\ |V_{n+1}(s) - V_n(s)| < \varepsilon$: stop

References

Egbert, G.D.; Erofeeva, S.Y. Efficient inverse modeling of barotropic ocean tides. J. Atmos. Ocean. Technol. 2002, 19, 183–204.
Fu, B.; Chen, L.; Zhou, Y.; Zheng, D.; Wei, Z.; Dai, J.; Pan, H. An improved A* algorithm for the industrial robot path planning with high success rate and short length. Robot. Auton. Syst. 2018, 106, 26–37.
Bai, A.; Wu, F.; Chen, X. Online planning for large MDPs with MAXQ decomposition. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, Valencia, Spain, 4–8 June 2012; pp. 1215–1216.

Shu, M.; Zheng, X.; Li, F.; Wang, K.; Li, Q. Numerical Simulation of Time-Optimal Path Planning for Autonomous Underwater Vehicles Using a Markov Decision Process Method. Appl. Sci. 2022, 12, 3064. https://doi.org/10.3390/app12063064

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
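The value iteration of Algorithm 1 can be illustrated with a minimal sketch. It assumes deterministic transitions ($P(s' \mid s, a) = 1$ for the intended neighboring cell, so the expectation over $s'$ collapses to a single term), a uniform $-1$ per-step "time cost" reward, and an invented 4 × 4 grid with one blocked cell; none of these settings are taken from the paper.

```python
def value_iteration(reward, passable, gamma=0.95, eps=1e-4, n_max=1000):
    """Minimal sketch of Algorithm 1 on a 2-D grid.

    Each action moves the AUV to one of the four neighbouring cells
    deterministically, so the Bellman backup reduces to
    V_{n+1}(s) = max over neighbours s' of R(s') + gamma * V_n(s').
    """
    rows, cols = len(reward), len(reward[0])
    V = [[0.0] * cols for _ in range(rows)]  # V_0(s) = 0 for all s
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))
    for _ in range(n_max):
        V_new = [row[:] for row in V]
        delta = 0.0
        for r in range(rows):
            for c in range(cols):
                if not passable[r][c]:
                    continue  # land/obstacle cells are not valid states
                best = None
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and passable[nr][nc]:
                        v = reward[nr][nc] + gamma * V[nr][nc]
                        if best is None or v > best:
                            best = v  # greedy strategy: max over actions
                if best is not None:
                    V_new[r][c] = best
                    delta = max(delta, abs(best - V[r][c]))
        V = V_new
        if delta < eps:  # the epsilon stopping test of Algorithm 1
            break
    return V

# Toy setup: a uniform -1 "time cost" per move, a large reward at the
# target cell, and one blocked cell mimicking an island.
reward = [[-1.0] * 4 for _ in range(4)]
reward[3][3] = 100.0
passable = [[True] * 4 for _ in range(4)]
passable[1][1] = False
V = value_iteration(reward, passable)
```

The optimal strategy $\pi^*(s)$ is then recovered by choosing, in each cell, the neighboring cell that maximizes $R(s, a) + \gamma V(s')$; since the values of every state are computed in one sweep, the optimal path from any start point follows immediately.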